PC Magazine's BASIC Techniques and Utilities
by Ethan Winer

In memory of my father, Dr. Frank Winer

TABLE OF CONTENTS

INTRODUCTION

Part I: UNDER THE HOOD
  Chapter 1. An Introduction to Compiled BASIC
  Chapter 2. Variables and Constant Data
  Chapter 3. Programming Methods

Part II: HANDS-ON PROGRAMMING
  Chapter 4. Debugging Strategies
  Chapter 5. Compiling and Linking
  Chapter 6. File and Device Handling
  Chapter 7. Database and Network Programming
  Chapter 8. Sorting and Searching

Part III: BEYOND BASIC
  Chapter 9. Program Optimization
  Chapter 10. Key Memory Areas in the PC
  Chapter 11. Accessing DOS and BIOS Services
  Chapter 12. Assembly Language Programming

ACKNOWLEDGEMENTS

Many people helped me during the preparation of this book. First and foremost I want to thank my publisher, Cindy Hudson, for her outstanding support and encouragement, and for making it all happen. I also want to thank "black belt" editor Deborah Craig for a truly outstanding job. Never had I seen someone reduce a sentence from 24 words to less than half that, and improve the meaning in the process. [Unfortunately, readers of this disk version are unable to benefit from Deborah's excellent work.]

Several other people deserve praise as well:

Don Malin for his programming advice, and for eliminating all my GOTO statements.

Jonathan Zuck for his contribution on database and network programming, including all of the dBASE file access routines.

Paul Passarelli for unraveling the mysteries of floating point emulation, and for sharing that expertise with me.

Philip Martin Valley for his research and examples showing how to read Lotus 1-2-3 binary files.

Jim Mack for his skillful proofreading of my manuscript, and countless good ideas.

My wife Elli for her support and encouragement during the eight long months it took to write this book.

ABOUT THE AUTHOR

Ethan Winer is the founder of Crescent Software, Inc., located in Ridgefield, Connecticut.
He has programmed in BASIC and assembly language since 1980, and is the author of Crescent's QuickPak Professional and P.D.Q. products. Ethan has also served as a contributing editor for both PC Magazine and BASICPro (now Visual Basic Programmer's Journal), and has written numerous feature articles for other popular computer publications. In 1992 Ethan retired from writing software professionally, and now spends his free time writing and performing music.

PREFACE

INTRODUCTION
============

BASIC has always been the most popular language for personal computers. It is easy to learn and use, extremely powerful, and some form of BASIC is included for free with nearly every PC. Although BASIC is often associated with beginners and students, it is in fact ideally suited for a wide range of programming projects. Because it offers the best features of a high-level language coupled with direct access to DOS and BIOS system services, BASIC is fast becoming the language of choice for beginners and professional developers alike.

This book is about power programming using Microsoft compiled BASIC. It is intended for people who already possess a fundamental understanding of BASIC programming concepts, but want to achieve the best performance possible from their BASIC compiler.

Power programming is knowing when and how to use BASIC commands such as CALL INTERRUPT, VARSEG and VARPTR, and even PEEK and POKE effectively. It involves understanding the PC's memory organization sufficiently to determine how much stack space is needed for a recursive subprogram or function. A power programmer knows how to translate a time-critical portion of a BASIC program into assembly language when needed. Finally, and perhaps most importantly, power programming means knowing enough about BASIC's internal operation to determine which sequence of instructions is smaller or faster than another.

This book will show you how to go beyond creating programs that merely work.
Because it explains how the compiler operates and how it interacts with the BASIC runtime language library, this book will teach you how to write programs that are as small and fast as possible. Although the emphasis here is on Microsoft QuickBASIC and the BASIC Professional Development System (PDS), much of the information will apply to other BASIC compilers such as Power Basic from Spectra Publishing.

Despite what you may have read, BASIC is the most capable and the easiest to learn of the high-level languages. Modern BASIC compilers are highly optimizing, and can thus create extremely efficient executable programs. In addition, you can often achieve with just a few BASIC statements what would take many pages of code in another high-level language. Moreover, beginners can be immediately productive in BASIC, while serious programmers have a wealth of powerful capabilities at their disposal.

Microsoft BASIC has many capabilities that are not available in any other high-level language. Among these are dynamic (variable-length) strings, automatic memory allocation and heap management, built-in support for sophisticated graphics, and interrupt-driven communications. Add to that huge arrays, network file handling, music and sound, and protection against inadvertently overwriting memory, and you can see why BASIC is so popular.

This book aims to provide intermediate to advanced programmers with information that is not available elsewhere. It does not, however, cover elementary topics such as navigating the QuickBASIC editor, loading and saving files, or using the Search and Replace feature. That information is readily available elsewhere. Rather, it delves into previously uncharted territory, and examines compiled BASIC at its innermost layer.

Besides the discussions and programs in the text, this book includes a companion disk [separate ZIP file] that contains all of the subroutines and other code listed in this book, including several useful utilities.
Installing these programs is described in the Appendix.

CONVENTIONS USED IN THIS BOOK
=============================

This book uses the terms QuickBASIC and QB to mean the Microsoft QuickBASIC 4.x and 7.x editing environments. BC and Compiler indicate the BC.EXE command-line compiler that comes with QuickBASIC, Microsoft BASIC PDS, and the now-discontinued BASIC 6.0. When a distinction is necessary, QBX will refer to the QuickBASIC Extended editor that comes with the BASIC Professional Development System (PDS).

In most cases, the discussions will be the same for all of these versions of BASIC. When a difference does occur, the PDS and QBX exceptions will be indicated.

[Because there is no way to indicate italics in a disk file, where they would have been used for emphasis or clarity the words are instead surrounded by asterisks (*).]

HOW THIS BOOK IS ORGANIZED
==========================

This book is divided into parts, and each part contains several chapters that discuss a specific aspect of BASIC programming. You needn't fully understand an entire chapter before moving on to the next one. Each topic will be covered in great depth, and in many cases you will want to return to a given chapter as your knowledge and understanding of the subject matter increases.

Part 1 is "Under the Hood," and its three chapters describe in detail how your BASIC source code is manipulated throughout the compiling and linking process.

Chapter 1 presents an overview of compilers in general, and BASIC compilers in particular. It discusses what BASIC compilers are all about and how they work, and how the compiled code that is generated interacts with routines in the runtime libraries.

Chapter 2 discusses variables, constants, and other program data, and how they fit within the context of the PC's memory organization. This chapter also covers bit manipulation using AND, OR, and XOR.

Chapter 3 examines the various control flow methods available in BASIC, showing which statements and procedure constructs are appropriate in different situations. In particular, you will learn the relative advantages and disadvantages of each method, based on their capabilities, code size, and speed.

Part 2, "Hands-On Programming," examines programming techniques, and shows specific examples of writing effective code and also making it work.

Chapter 4 explores program debugging using the facilities built into the QuickBASIC editing environment, as well as the CodeView utility that comes with Microsoft BASIC PDS. This chapter also discusses common programming problems, along with the appropriate solutions.

Chapter 5 explains compiling and linking, both from within the QB environment, and directly from DOS. A number of compiler options are inadequately documented by Microsoft, and each is discussed here in great detail. A thorough discussion of the LIB.EXE utility program included with BASIC explains how libraries are manipulated and organized.

Chapter 6 covers all aspects of file and device handling, and discusses the many different ways in which data may be read and written. The emphasis here is on speeding file handling as much as possible, and storing data on disk efficiently. Because input/output (I/O) devices are accessed similarly, they too are described here in detail.

Chapter 7 explains the basics of writing database and network applications, and discusses file locking strategies using practical programming examples. A series of subroutines show how to read and write files using the popular dBASE format, and these may be incorporated into programs that you write.

Chapter 8 shows how to sort and search array data as quickly as possible. Several methods are examined including conventional and indexed sorting, and many useful subroutines are presented.

The final part, "Beyond BASIC," includes information that is rarely covered in books about BASIC.
Its chapters go far beyond the information provided in any of the Microsoft manuals.

Chapter 10 identifies many of the key memory areas in the PC, and shows when and how they can be manipulated in a BASIC program.

Chapter 11 presents an in-depth discussion of accessing DOS and BIOS services using CALL INTERRUPT. These services offer a wealth of functionality that BASIC cannot otherwise provide directly.

Chapter 12 is an introduction to assembly language, from a BASIC programmer's perspective. This chapter presents many useful subroutines, and includes a thorough discussion of how they work.

Finally, the Appendix describes the additional source files that accompany this book.

A BRIEF HISTORY OF MICROSOFT COMPILED BASIC
===========================================

In March of 1982, IBM released the first BASIC compiler for the IBM PC. This compiler, BASCOM 1.0, was written by Microsoft for IBM using code and methods developed by Bill Gates, Greg Whitten, and others. Although Microsoft had already written BASIC compilers for the Apple II and CP/M computers, BASCOM 1.0 was the most powerful they had produced so far.

Compared to the Microsoft BASIC interpreters available at that time, BASCOM 1.0 offered many additional capabilities, and also an enormous increase in program execution speed. Line numbers were no longer mandatory, program statements could exceed 255 characters, and a single string could be as long as 32,767 characters. Further, assembly language subroutines could be linked directly to a compiled BASIC application.

Over the next few years, Microsoft continued to enhance the compiler, and in 1985 it was released by IBM as BASCOM 2.0. This version offered many improvements over the older BASCOM 1.0. Among the most important were multi-line DEF FN functions, dynamic arrays, descriptive line labels (as opposed to numbers), network record locking, and an ISAM file handler.
With named subroutines programmers were finally able to exceed the 64K code size limitation, by writing separate modules that could then be linked together. The inclusion of subroutine parameters--long overdue for BASIC--was an equally important step toward fostering structured programming techniques in the language.

At the same time that IBM released BASCOM 2.0, Microsoft offered essentially the same product as QuickBASIC 1.0, but without the ISAM file handler. However, there was one other big difference between these compilers: QuickBASIC 1.0 carried a list price of only $99. This low price was perhaps the most important feature of all, because high-performance BASIC was finally available to everyone, and not just professional developers.

Encouraged by the tremendous acceptance of QuickBASIC 1.0, Microsoft quickly followed that with QuickBASIC version 2.0 in early 1986. This important new release added an integrated editing environment, as well as EGA graphics capabilities. The editor was especially welcome, because it allowed programs to be developed and tested very rapidly. The environment was further enhanced with the advent of Quick Libraries, which allowed assembly language subroutines to be easily added to a BASIC program. Quick Libraries also helped launch a new class of BASIC product: third-party add-on libraries.

In early 1987 Microsoft released the next major enhancement to QuickBASIC, version 3.0. QuickBASIC 3.0 included a limited form of step and trace debugging, as well as the ability to monitor a variable's value continuously during program execution. Also added was support for the EGA's 43-line text mode, and several new language features. Perhaps most impressive of the new features were the control flow statements DO and LOOP, and SELECT CASE. Beyond merely providing a useful alternative to the IF statement, these constructs also let the compiler generate more efficient code.

Also added with version 3.0 was optional support for an 8087 numeric coprocessor. In order to support a coprocessor, however, Microsoft had to abandon their own proprietary numeric format. Both the Microsoft and IEEE methods for storing single- and double-precision numbers use four bytes and eight bytes respectively, but the bits are organized differently. Although the IEEE format, which the 8087 requires, is substantially slower than Microsoft's own, it is the current standard. Therefore, a second version of the compiler was included solely to support IEEE math.

By the time QuickBASIC 4.0 was announced in late 1987, hundreds of thousands of copies of QuickBASIC were already in use worldwide. With QuickBASIC 4.0, Microsoft had created the most sophisticated programming environment ever seen in a mainstream language: the threaded p-code interpreter. This remarkable technology allowed programmers to enjoy the best features of an interpreted language, but with the execution speed of a compiler. In addition to an Immediate mode whereby program statements could be executed one by one, QuickBASIC 4.0 also supported program breakpoints, monitoring the value of multiple variables and expressions, and even stepping *backwards* through a program. This greatly enhanced the debugging capabilities of the language, and increased programmer productivity enormously.

Also new in QuickBASIC 4.0 was support for inter-language calling. Although this meant that a program written in Microsoft BASIC could now call subroutines written in any of the other Microsoft languages, it also meant that IEEE math was no longer an option--it became mandatory. When a QuickBASIC 4.0 program was run on a PC equipped with a coprocessor, floating point math was performed very quickly indeed. However, it was very much slower on every other computer! This remained a sore point for many BASIC programmers, until Microsoft introduced BASIC 6.0 later that year.
That version included an alternate math library that was similar to their original proprietary format.

Also added in QuickBASIC 4.0 were huge arrays, long (4-byte) integer variables, user-defined TYPE variables, fixed-length strings, true functions, and support for CodeView debugging. With the introduction of huge arrays, BASIC programmers could create arrays that exceeded 64K in size, with only a few restrictions. TYPE variables let the programmer define a composite data type composed of any mix of BASIC's intrinsic data forms, thus adding structure to a program's data as well as to its code. The newly added FUNCTION procedures greatly improved on BASIC's earlier DEF FN-style functions by allowing recursion, the passing of TYPE variables and entire arrays as arguments, and the ability to modify an incoming parameter.

Although BASIC 6.0 provided essentially the same environment and compiler as QuickBASIC 4.0, it also included the ability to create programs that could be run under OS/2. Other features of this release were a utility program to create custom run-time libraries, and a copy of the Microsoft Programmer's Editor. The custom run-time utility was particularly valuable, since it allowed programmers to combine frequently used subroutines with the BRUN.EXE language library, and then share those routines among any number of chained modules.

QuickBASIC 4.5 was introduced in 1988, although the only major enhancements over the earlier 4.0 version were a new help system and slightly improved pull-down menus. Unfortunately, the new menus required much more memory than QuickBASIC 4.0, and the "improved" environment reduced the memory available for programs and data by approximately 40K. To this day, many programmers continue to use QuickBASIC 4.0 precisely because of its increased program capacity.

In answer to programmers' demands for more string memory and smaller, more efficient programs, Microsoft released the BASIC Professional Development System version 7.0 in late 1989. This was an enormous project even for a company the size of Microsoft, and at one point more than fifty programmers were working on the new compiler and QBX environment. PDS version 7.0 finally let BASIC programmers exceed the usual 64K string memory limit, albeit with some limitations. Other features introduced with that version were an ISAM file handler, improved library granularity, example tool box packages for creating simple graphics and pull-down menus, local error handling, arrays within TYPE variables, and greatly improved documentation. Because the QBX editor uses expanded memory to store subprograms and functions, much larger programs could be developed without resorting to editing and compiling outside of the environment.

Six months later PDS version 7.1 was released, with the long-overdue ability to redimension an array without destroying its contents. Also added in that version were support for passing fixed-length string arrays to subprograms and functions, and an option to pass parameters by value to BASIC procedures. Although the BYVAL option had been available since QuickBASIC 4.0, it was usable only with subroutines written in non-BASIC languages. With this mechanism, BASIC can now create more efficient object code than ever before.

[Just as this book was being completed, Microsoft released Visual Basic for DOS. Although this book does not address VB/DOS specifically, most of the information about BASIC PDS applies to VB/DOS. One notable exception is that VB/DOS supports far strings only, where BASIC PDS lets you specify either near strings or far. Because far strings are stored in a separate "far" area of DOS memory, it takes slightly longer to access those strings.
Therefore, a VB/DOS program that is string-intensive will not be as fast as an equivalent compiled with QuickBASIC or with PDS near strings. This book also does not cover the pseudo event-driven forms used by VB/DOS.]